Dataset statistics
| Number of variables | 25 |
|---|---|
| Number of observations | 77414 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 17.4 MiB |
| Average record size in memory | 235.3 B |
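The two memory figures above can be cross-checked against each other. A minimal sketch, assuming the average record size is simply the total size divided by the number of observations (the report does not state the formula, and both inputs are rounded):

```python
# Cross-check the reported memory figures (both are rounded in the report).
# Assumption: average record size = total bytes / number of observations.
MIB = 1024 ** 2

n_observations = 77414
avg_record_bytes = 235.3  # reported average record size in bytes

# Total size implied by the per-record average.
total_mib = avg_record_bytes * n_observations / MIB
print(f"{total_mib:.1f} MiB")  # prints "17.4 MiB"
```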
Variable types
| Numeric | 14 |
|---|---|
| Categorical | 4 |
| DateTime | 4 |
| Unsupported | 3 |
Alerts
| Alert | Type |
|---|---|
| VALOR_A_PAGAR is highly overall correlated with valor_emprestimo and 1 other field | High correlation |
| RENDA_MES_ANTERIOR is highly overall correlated with diff_renda | High correlation |
| meses_desde_cadastro is highly overall correlated with dias_desde_cadastro | High correlation |
| dias_desde_cadastro is highly overall correlated with SAFRA_REF and 1 other field | High correlation |
| valor_emprestimo is highly overall correlated with VALOR_A_PAGAR and 1 other field | High correlation |
| diff_renda is highly overall correlated with VALOR_A_PAGAR and 2 other fields | High correlation |
| tempo_sem_pagar is highly overall correlated with prazo_em_dias | High correlation |
| inadimplente is highly overall correlated with tempo_sem_pagar | High correlation |
| prazo_em_dias is highly overall correlated with tempo_sem_pagar | High correlation |
| SAFRA_REF is highly overall correlated with NO_FUNCIONARIOS and 1 other field | High correlation |
| DDD is highly overall correlated with CEP_2_DIG | High correlation |
| CEP_2_DIG is highly overall correlated with DDD | High correlation |
| NO_FUNCIONARIOS is highly overall correlated with SAFRA_REF | High correlation |
| tempo_sem_pagar is highly skewed (γ₁ = -45.01485734) | Skewed |
| prazo_em_dias is highly skewed (γ₁ = 43.88720817) | Skewed |
| SEGMENTO_INDUSTRIAL is an unsupported type; check if it needs cleaning or further analysis | Unsupported |
| DOMINIO_EMAIL is an unsupported type; check if it needs cleaning or further analysis | Unsupported |
| PORTE is an unsupported type; check if it needs cleaning or further analysis | Unsupported |
| DDD has 7583 (9.8%) zeros | Zeros |
| RENDA_MES_ANTERIOR has 1570 (2.0%) zeros | Zeros |
| NO_FUNCIONARIOS has 1767 (2.3%) zeros | Zeros |
| tempo_sem_pagar has 60742 (78.5%) zeros | Zeros |
| ultima_data_emprestimo has 9650 (12.5%) zeros | Zeros |
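The Skewed and Zeros alerts rest on two simple quantities: the skewness coefficient γ₁ and the share of zero entries. The sketch below uses the plain moment-based definition of skewness on a made-up sample; the report's exact estimator (for example, a bias-corrected variant) may differ, and `skewness` and `zero_share` are hypothetical helper names.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Made-up sample: mostly zeros with one large negative outlier.
sample = [0] * 8 + [1, -50]
print(zero_share(sample))    # 0.8
print(skewness(sample) < 0)  # True: the long tail is on the negative side
```

A single extreme value is enough to push γ₁ far from zero, which is consistent with the very large magnitudes flagged above.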
Reproduction
| Analysis started | 2022-12-16 15:51:10.456432 |
|---|---|
| Analysis finished | 2022-12-16 15:51:58.246011 |
| Duration | 47.79 seconds |
| Software version | pandas-profiling v3.5.0 |
| Download configuration | config.json |
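The Duration row follows directly from the two timestamps. A small check using only the standard library:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-12-16 15:51:10.456432", FMT)
finished = datetime.strptime("2022-12-16 15:51:58.246011", FMT)

# Elapsed wall-clock time between the two reported timestamps.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints "47.79 seconds"
```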
ID_CLIENTE
Real number (ℝ)
| Distinct | 1248 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.6622701 × 10¹⁸ |
| Minimum | 8.7842371 × 10¹⁵ |
|---|---|
| Maximum | 9.2060308 × 10¹⁸ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 8.7842371 × 10¹⁵ |
|---|---|
| 5-th percentile | 4.5929061 × 10¹⁷ |
| Q1 | 2.3693649 × 10¹⁸ |
| median | 4.817817 × 10¹⁸ |
| Q3 | 6.9693486 × 10¹⁸ |
| 95-th percentile | 8.6099387 × 10¹⁸ |
| Maximum | 9.2060308 × 10¹⁸ |
| Range | 9.1972466 × 10¹⁸ |
| Interquartile range (IQR) | 4.5999837 × 10¹⁸ |
Descriptive statistics
| Standard deviation | 2.6657194 × 10¹⁸ |
|---|---|
| Coefficient of variation (CV) | 0.57176425 |
| Kurtosis | -1.2274273 |
| Mean | 4.6622701 × 10¹⁸ |
| Median Absolute Deviation (MAD) | 2.3245111 × 10¹⁸ |
| Skewness | -0.080216523 |
| Sum | -4.0133142 × 10¹⁸ |
| Variance | 7.1060599 × 10³⁶ |
| Monotonicity | Not monotonic |
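The negative Sum above sits oddly next to a strictly positive Minimum. One possible explanation, which the report itself does not confirm, is that the identifiers are stored as 64-bit integers and their sum wrapped around. The sketch below emulates signed 64-bit addition in plain Python; the two identifiers are made up (only their magnitude mirrors the report) and `wrap_int64` is a hypothetical helper.

```python
def wrap_int64(value):
    """Reduce an arbitrary Python int to the signed 64-bit range."""
    value &= (1 << 64) - 1   # keep the low 64 bits
    if value >= 1 << 63:     # reinterpret the top bit as the sign
        value -= 1 << 64
    return value


# Made-up identifiers of the same magnitude as those in the report (~10**18).
ids = [6_964_108_750_000_000_000, 5_761_480_994_000_000_000]

exact_sum = sum(ids)                 # Python ints never overflow
wrapped_sum = wrap_int64(exact_sum)  # what fixed-width arithmetic would store

print(exact_sum > 0, wrapped_sum < 0)  # True True
```

If this is what happened, the Sum row is not meaningful for an identifier column, while the Mean (computed in floating point) is unaffected.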
| Value | Count | Frequency (%) |
| 6.96410875 × 10¹⁸ | 1151 | 1.5% |
| 5.761480994 × 10¹⁸ | 1055 | 1.4% |
| 4.008627435 × 10¹⁸ | 877 | 1.1% |
| 8.173830875 × 10¹⁸ | 675 | 0.9% |
| 6.916556752 × 10¹⁸ | 638 | 0.8% |
| 7.930925884 × 10¹⁸ | 585 | 0.8% |
| 4.592906068 × 10¹⁷ | 557 | 0.7% |
| 4.045739495 × 10¹⁸ | 539 | 0.7% |
| 7.325545719 × 10¹⁸ | 510 | 0.7% |
| 3.355881108 × 10¹⁸ | 505 | 0.7% |
| Other values (1238) | 70322 | 90.8% |
| Value | Count | Frequency (%) |
| 8.78423715 × 10¹⁵ | 241 | 0.3% |
| 1.507004831 × 10¹⁶ | 5 | < 0.1% |
| 1.871961495 × 10¹⁶ | 7 | < 0.1% |
| 3.954702544 × 10¹⁶ | 70 | 0.1% |
| 4.326664122 × 10¹⁶ | 9 | < 0.1% |
| 4.963290558 × 10¹⁶ | 47 | 0.1% |
| 6.62200874 × 10¹⁶ | 61 | 0.1% |
| 6.97663625 × 10¹⁶ | 45 | 0.1% |
| 8.611006299 × 10¹⁶ | 93 | 0.1% |
| 8.643695504 × 10¹⁶ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 9.20603081 × 10¹⁸ | 104 | 0.1% |
| 9.205015187 × 10¹⁸ | 26 | < 0.1% |
| 9.184785003 × 10¹⁸ | 121 | 0.2% |
| 9.175443729 × 10¹⁸ | 6 | < 0.1% |
| 9.161263096 × 10¹⁸ | 36 | < 0.1% |
| 9.156666134 × 10¹⁸ | 36 | < 0.1% |
| 9.142266044 × 10¹⁸ | 9 | < 0.1% |
| 9.127318191 × 10¹⁸ | 57 | 0.1% |
| 9.108733472 × 10¹⁸ | 64 | 0.1% |
| 9.101617111 × 10¹⁸ | 19 | < 0.1% |
SAFRA_REF
Categorical
| Distinct | 35 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| 2021-05 | 2531 |
|---|---|
| 2021-06 | 2513 |
| 2020-11 | 2471 |
| 2019-10 | 2464 |
| 2019-12 | 2448 |
| Other values (30) | 64987 |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Characters and Unicode
| Total characters | 541898 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2018-08 |
|---|---|
| 2nd row | 2018-08 |
| 3rd row | 2018-08 |
| 4th row | 2018-08 |
| 5th row | 2018-08 |
Common Values
| Value | Count | Frequency (%) |
| 2021-05 | 2531 | 3.3% |
| 2021-06 | 2513 | 3.2% |
| 2020-11 | 2471 | 3.2% |
| 2019-10 | 2464 | 3.2% |
| 2019-12 | 2448 | 3.2% |
| 2020-01 | 2414 | 3.1% |
| 2020-10 | 2402 | 3.1% |
| 2021-01 | 2393 | 3.1% |
| 2020-09 | 2382 | 3.1% |
| 2019-11 | 2377 | 3.1% |
| Other values (25) | 53019 | 68.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 167486 | 30.9% |
| 2 | 131314 | 24.2% |
| 1 | 85794 | 15.8% |
| - | 77414 | 14.3% |
| 9 | 34166 | 6.3% |
| 8 | 14793 | 2.7% |
| 6 | 6796 | 1.3% |
| 5 | 6752 | 1.2% |
| 3 | 6557 | 1.2% |
| 4 | 6196 | 1.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 464484 | 85.7% |
| Dash Punctuation | 77414 | 14.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 167486 | 36.1% |
| 2 | 131314 | 28.3% |
| 1 | 85794 | 18.5% |
| 9 | 34166 | 7.4% |
| 8 | 14793 | 3.2% |
| 6 | 6796 | 1.5% |
| 5 | 6752 | 1.5% |
| 3 | 6557 | 1.4% |
| 4 | 6196 | 1.3% |
| 7 | 4630 | 1.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 77414 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 541898 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 167486 | 30.9% |
| 2 | 131314 | 24.2% |
| 1 | 85794 | 15.8% |
| - | 77414 | 14.3% |
| 9 | 34166 | 6.3% |
| 8 | 14793 | 2.7% |
| 6 | 6796 | 1.3% |
| 5 | 6752 | 1.2% |
| 3 | 6557 | 1.2% |
| 4 | 6196 | 1.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 541898 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 167486 | 30.9% |
| 2 | 131314 | 24.2% |
| 1 | 85794 | 15.8% |
| - | 77414 | 14.3% |
| 9 | 34166 | 6.3% |
| 8 | 14793 | 2.7% |
| 6 | 6796 | 1.3% |
| 5 | 6752 | 1.2% |
| 3 | 6557 | 1.2% |
| 4 | 6196 | 1.1% |
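The character and category tables above can be reproduced in spirit with the standard library. A minimal sketch on three made-up values in the same `YYYY-MM` format as SAFRA_REF, assuming the categories are Unicode general categories (`Nd` is "Decimal Number", `Pd` is "Dash Punctuation"); script and block detection are not covered because the standard library does not expose them:

```python
import unicodedata
from collections import Counter

# Made-up values in the same "YYYY-MM" format as SAFRA_REF.
values = ["2018-08", "2021-05", "2021-06"]

# Count individual characters and their Unicode general categories.
char_counts = Counter(ch for v in values for ch in v)
category_counts = Counter(unicodedata.category(ch) for v in values for ch in v)

print(category_counts["Nd"], category_counts["Pd"])  # 18 3
```

Each value contributes six digits and one hyphen, which matches the 6:1 ratio between Decimal Number and Dash Punctuation in the report (464484 to 77414).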
| Distinct | 1040 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Minimum | 2018-08-17 00:00:00 |
|---|---|
| Maximum | 2021-06-30 00:00:00 |
DATA_PAGAMENTO
Date
| Distinct | 921 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Minimum | 2018-06-19 00:00:00 |
|---|---|
| Maximum | 2021-11-24 00:00:00 |
DATA_VENCIMENTO
Date
| Distinct | 955 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Minimum | 2017-11-27 00:00:00 |
|---|---|
| Maximum | 2027-03-31 00:00:00 |
VALOR_A_PAGAR
Real number (ℝ)
| Distinct | 68527 |
|---|---|
| Distinct (%) | 88.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 46565.461 |
| Minimum | 0.1 |
|---|---|
| Maximum | 4400000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 0.1 |
|---|---|
| 5-th percentile | 3234.859 |
| Q1 | 18752.745 |
| median | 34751.35 |
| Q3 | 60884.205 |
| 95-th percentile | 128377.11 |
| Maximum | 4400000 |
| Range | 4399999.9 |
| Interquartile range (IQR) | 42131.46 |
Descriptive statistics
| Standard deviation | 46338.921 |
|---|---|
| Coefficient of variation (CV) | 0.99513503 |
| Kurtosis | 1153.2635 |
| Mean | 46565.461 |
| Median Absolute Deviation (MAD) | 18119.175 |
| Skewness | 16.283832 |
| Sum | 3.6048186 × 10⁹ |
| Variance | 2.1472956 × 10⁹ |
| Monotonicity | Not monotonic |
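Several of the VALOR_A_PAGAR statistics above are derived from others, so they can be checked against each other. A small arithmetic sketch using the rounded values from the tables; the textbook definitions are assumed, since the report does not spell them out:

```python
# Reported values for VALOR_A_PAGAR (rounded as in the report).
minimum, maximum = 0.1, 4_400_000
q1, q3 = 18752.745, 60884.205
mean, std = 46565.461, 46338.921

value_range = maximum - minimum  # reported Range: 4399999.9
iqr = q3 - q1                    # reported IQR: 42131.46
cv = std / mean                  # reported CV: 0.99513503
variance = std ** 2              # reported Variance: 2.1472956e9

print(f"{cv:.5f}")  # prints "0.99514"
```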
| Value | Count | Frequency (%) |
| 1182 | 618 | 0.8% |
| 999 | 246 | 0.3% |
| 945.6 | 171 | 0.2% |
| 360 | 130 | 0.2% |
| 1341 | 93 | 0.1% |
| 591 | 73 | 0.1% |
| 1063.8 | 54 | 0.1% |
| 499 | 50 | 0.1% |
| 1064 | 40 | 0.1% |
| 300 | 31 | < 0.1% |
| Other values (68517) | 75908 | 98.1% |
| Value | Count | Frequency (%) |
| 0.1 | 1 | < 0.1% |
| 0.4 | 1 | < 0.1% |
| 0.45 | 1 | < 0.1% |
| 0.7 | 1 | < 0.1% |
| 5.17 | 1 | < 0.1% |
| 5.5 | 1 | < 0.1% |
| 5.78 | 1 | < 0.1% |
| 5.95 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 6.22 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 4400000 | 1 | < 0.1% |
| 2250000 | 1 | < 0.1% |
| 1697544.07 | 1 | < 0.1% |
| 1500000 | 1 | < 0.1% |
| 1391835.2 | 1 | < 0.1% |
| 1325000 | 1 | < 0.1% |
| 1210000 | 1 | < 0.1% |
| 1200000 | 1 | < 0.1% |
| 1160000 | 1 | < 0.1% |
| 1000000 | 1 | < 0.1% |
TAXA
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| 5.99 | 26459 |
|---|---|
| 6.99 | 22021 |
| 4.99 | 15703 |
| 8.99 | 7934 |
| 11.99 | 5297 |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.0684243 |
| Min length | 4 |
Characters and Unicode
| Total characters | 314953 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 6.99 |
|---|---|
| 2nd row | 6.99 |
| 3rd row | 6.99 |
| 4th row | 6.99 |
| 5th row | 6.99 |
Common Values
| Value | Count | Frequency (%) |
| 5.99 | 26459 | 34.2% |
| 6.99 | 22021 | 28.4% |
| 4.99 | 15703 | 20.3% |
| 8.99 | 7934 | 10.2% |
| 11.99 | 5297 | 6.8% |
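The Length and Characters figures reported for TAXA follow from the value counts in the table above. A small arithmetic check, assuming length is measured in characters of each value's string form:

```python
# Counts from the Common Values table for TAXA.
counts = {"5.99": 26459, "6.99": 22021, "4.99": 15703, "8.99": 7934, "11.99": 5297}

n = sum(counts.values())                                  # 77414 observations
total_chars = sum(len(v) * c for v, c in counts.items())  # 314953 characters
mean_length = total_chars / n                             # about 4.0684243

print(n, total_chars)  # prints "77414 314953"
```

Only `11.99` has five characters, which is why the mean length sits just above the median of 4.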
Most occurring characters
| Value | Count | Frequency (%) |
| 9 | 154828 | 49.2% |
| . | 77414 | 24.6% |
| 5 | 26459 | 8.4% |
| 6 | 22021 | 7.0% |
| 4 | 15703 | 5.0% |
| 1 | 10594 | 3.4% |
| 8 | 7934 | 2.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 237539 | 75.4% |
| Other Punctuation | 77414 | 24.6% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 154828 | 65.2% |
| 5 | 26459 | 11.1% |
| 6 | 22021 | 9.3% |
| 4 | 15703 | 6.6% |
| 1 | 10594 | 4.5% |
| 8 | 7934 | 3.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 77414 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 314953 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 9 | 154828 | 49.2% |
| . | 77414 | 24.6% |
| 5 | 26459 | 8.4% |
| 6 | 22021 | 7.0% |
| 4 | 15703 | 5.0% |
| 1 | 10594 | 3.4% |
| 8 | 7934 | 2.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 314953 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 9 | 154828 | 49.2% |
| . | 77414 | 24.6% |
| 5 | 26459 | 8.4% |
| 6 | 22021 | 7.0% |
| 4 | 15703 | 5.0% |
| 1 | 10594 | 3.4% |
| 8 | 7934 | 2.5% |
DATA_CADASTRO
Date
| Distinct | 732 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| Minimum | 2000-08-15 00:00:00 |
|---|---|
| Maximum | 2021-06-23 00:00:00 |
DDD
Real number (ℝ)
| Distinct | 75 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37.859754 |
| Minimum | 0 |
|---|---|
| Maximum | 99 |
| Zeros | 7583 |
| Zeros (%) | 9.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 12 |
| median | 33 |
| Q3 | 62 |
| 95-th percentile | 91 |
| Maximum | 99 |
| Range | 99 |
| Interquartile range (IQR) | 50 |
Descriptive statistics
| Standard deviation | 27.760574 |
|---|---|
| Coefficient of variation (CV) | 0.73324761 |
| Kurtosis | -0.8790892 |
| Mean | 37.859754 |
| Median Absolute Deviation (MAD) | 22 |
| Skewness | 0.45143865 |
| Sum | 2930875 |
| Variance | 770.64948 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 11 | 8583 | 11.1% |
| 0 | 7583 | 9.8% |
| 31 | 3030 | 3.9% |
| 19 | 2741 | 3.5% |
| 21 | 2672 | 3.5% |
| 41 | 2415 | 3.1% |
| 71 | 2364 | 3.1% |
| 62 | 2324 | 3.0% |
| 43 | 2219 | 2.9% |
| 12 | 1611 | 2.1% |
| Other values (65) | 41872 | 54.1% |
| Value | Count | Frequency (%) |
| 0 | 7583 | 9.8% |
| 1 | 389 | 0.5% |
| 2 | 212 | 0.3% |
| 3 | 18 | < 0.1% |
| 4 | 257 | 0.3% |
| 5 | 86 | 0.1% |
| 6 | 300 | 0.4% |
| 7 | 18 | < 0.1% |
| 8 | 191 | 0.2% |
| 9 | 143 | 0.2% |
| Value | Count | Frequency (%) |
| 99 | 611 | 0.8% |
| 98 | 636 | 0.8% |
| 95 | 224 | 0.3% |
| 94 | 653 | 0.8% |
| 93 | 420 | 0.5% |
| 92 | 362 | 0.5% |
| 91 | 1026 | 1.3% |
| 88 | 457 | 0.6% |
| 87 | 301 | 0.4% |
| 86 | 519 | 0.7% |
FLAG_PF
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| 0 | 77195 |
|---|---|
| 1 | 219 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 77414 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 77195 | 99.7% |
| 1 | 219 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 77195 | 99.7% |
| 1 | 219 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 77414 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 77195 | 99.7% |
| 1 | 219 | 0.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 77414 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 77195 | 99.7% |
| 1 | 219 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 77414 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 77195 | 99.7% |
| 1 | 219 | 0.3% |
CEP_2_DIG
Real number (ℝ)
| Distinct | 90 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53.320653 |
| Minimum | 0 |
|---|---|
| Maximum | 99 |
| Zeros | 8 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 13 |
| Q1 | 29 |
| median | 54 |
| Q3 | 79 |
| 95-th percentile | 95 |
| Maximum | 99 |
| Range | 99 |
| Interquartile range (IQR) | 50 |
Descriptive statistics
| Standard deviation | 27.876572 |
|---|---|
| Coefficient of variation (CV) | 0.52281003 |
| Kurtosis | -1.412881 |
| Mean | 53.320653 |
| Median Absolute Deviation (MAD) | 25 |
| Skewness | 0.012713365 |
| Sum | 4127765 |
| Variance | 777.10327 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 13 | 3888 | 5.0% |
| 35 | 2845 | 3.7% |
| 68 | 2603 | 3.4% |
| 89 | 2588 | 3.3% |
| 86 | 2135 | 2.8% |
| 37 | 2107 | 2.7% |
| 12 | 2076 | 2.7% |
| 78 | 2067 | 2.7% |
| 75 | 1773 | 2.3% |
| 38 | 1673 | 2.2% |
| Other values (80) | 53659 | 69.3% |
| Value | Count | Frequency (%) |
| 0 | 8 | < 0.1% |
| 11 | 914 | 1.2% |
| 12 | 2076 | 2.7% |
| 13 | 3888 | 5.0% |
| 14 | 1381 | 1.8% |
| 15 | 1245 | 1.6% |
| 16 | 728 | 0.9% |
| 17 | 751 | 1.0% |
| 18 | 624 | 0.8% |
| 19 | 837 | 1.1% |
| Value | Count | Frequency (%) |
| 99 | 841 | 1.1% |
| 98 | 1081 | 1.4% |
| 97 | 490 | 0.6% |
| 96 | 682 | 0.9% |
| 95 | 1181 | 1.5% |
| 94 | 213 | 0.3% |
| 93 | 902 | 1.2% |
| 92 | 296 | 0.4% |
| 91 | 163 | 0.2% |
| 90 | 498 | 0.6% |
RENDA_MES_ANTERIOR
Real number (ℝ)
| Distinct | 18249 |
|---|---|
| Distinct (%) | 23.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 285130.1 |
| Minimum | 0 |
|---|---|
| Maximum | 1682759 |
| Zeros | 1570 |
| Zeros (%) | 2.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 31618 |
| Q1 | 126902 |
| median | 235330 |
| Q3 | 391492 |
| 95-th percentile | 703707 |
| Maximum | 1682759 |
| Range | 1682759 |
| Interquartile range (IQR) | 264590 |
Descriptive statistics
| Standard deviation | 214887.8 |
|---|---|
| Coefficient of variation (CV) | 0.75364825 |
| Kurtosis | 2.4978554 |
| Mean | 285130.1 |
| Median Absolute Deviation (MAD) | 124555 |
| Skewness | 1.3428007 |
| Sum | 2.2073062 × 10¹⁰ |
| Variance | 4.6176767 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1570 | 2.0% |
| 120152 | 91 | 0.1% |
| 168672 | 72 | 0.1% |
| 118073 | 55 | 0.1% |
| 415262 | 51 | 0.1% |
| 703707 | 51 | 0.1% |
| 293852 | 49 | 0.1% |
| 260432 | 45 | 0.1% |
| 444595 | 45 | 0.1% |
| 466768 | 42 | 0.1% |
| Other values (18239) | 75343 | 97.3% |
| Value | Count | Frequency (%) |
| 0 | 1570 | 2.0% |
| 105 | 8 | < 0.1% |
| 154 | 2 | < 0.1% |
| 216 | 7 | < 0.1% |
| 258 | 19 | < 0.1% |
| 352 | 6 | < 0.1% |
| 402 | 2 | < 0.1% |
| 531 | 1 | < 0.1% |
| 549 | 1 | < 0.1% |
| 632 | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| 1682759 | 5 | < 0.1% |
| 1646635 | 4 | < 0.1% |
| 1634789 | 5 | < 0.1% |
| 1622248 | 4 | < 0.1% |
| 1614315 | 7 | < 0.1% |
| 1613297 | 3 | < 0.1% |
| 1592917 | 1 | < 0.1% |
| 1583291 | 4 | < 0.1% |
| 1498812 | 2 | < 0.1% |
| 1478508 | 1 | < 0.1% |
NO_FUNCIONARIOS
Real number (ℝ)
| Distinct | 128 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 115.21209 |
| Minimum | 0 |
|---|---|
| Maximum | 198 |
| Zeros | 1767 |
| Zeros (%) | 2.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 84 |
| Q1 | 105 |
| median | 117 |
| Q3 | 130 |
| 95-th percentile | 147 |
| Maximum | 198 |
| Range | 198 |
| Interquartile range (IQR) | 25 |
Descriptive statistics
| Standard deviation | 25.020656 |
|---|---|
| Coefficient of variation (CV) | 0.21717039 |
| Kurtosis | 8.0513549 |
| Mean | 115.21209 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | -2.0341903 |
| Sum | 8919029 |
| Variance | 626.03321 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 117 | 1859 | 2.4% |
| 116 | 1837 | 2.4% |
| 122 | 1836 | 2.4% |
| 120 | 1790 | 2.3% |
| 0 | 1767 | 2.3% |
| 111 | 1691 | 2.2% |
| 121 | 1689 | 2.2% |
| 118 | 1627 | 2.1% |
| 112 | 1613 | 2.1% |
| 125 | 1607 | 2.1% |
| Other values (118) | 60098 | 77.6% |
| Value | Count | Frequency (%) |
| 0 | 1767 | 2.3% |
| 60 | 5 | < 0.1% |
| 61 | 5 | < 0.1% |
| 62 | 1 | < 0.1% |
| 63 | 6 | < 0.1% |
| 64 | 10 | < 0.1% |
| 65 | 4 | < 0.1% |
| 66 | 16 | < 0.1% |
| 67 | 37 | < 0.1% |
| 68 | 7 | < 0.1% |
| Value | Count | Frequency (%) |
| 198 | 1 | < 0.1% |
| 187 | 2 | < 0.1% |
| 186 | 1 | < 0.1% |
| 185 | 7 | < 0.1% |
| 182 | 7 | < 0.1% |
| 181 | 6 | < 0.1% |
| 180 | 1 | < 0.1% |
| 179 | 4 | < 0.1% |
| 178 | 7 | < 0.1% |
| 177 | 8 | < 0.1% |
tempo_sem_pagar
Real number (ℝ)
| Distinct | 317 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.17142894 |
| Minimum | -2661 |
|---|---|
| Maximum | 869 |
| Zeros | 60742 |
| Zeros (%) | 78.5% |
| Negative | 7906 |
| Negative (%) | 10.2% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | -2661 |
|---|---|
| 5-th percentile | -3 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 6 |
| Maximum | 869 |
| Range | 3530 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 25.229477 |
|---|---|
| Coefficient of variation (CV) | -147.17163 |
| Kurtosis | 3650.6049 |
| Mean | -0.17142894 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -45.014857 |
| Sum | -13271 |
| Variance | 636.5265 |
| Monotonicity | Not monotonic |
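Two figures in the statistics above stand out: a MAD of 0 and a large negative CV. Both follow from a zero-inflated column with a mean slightly below zero (the zero count and skewness here match the alerts for tempo_sem_pagar). The sketch below illustrates the effect on a made-up sample; `mad` is a hypothetical helper using the usual median-of-absolute-deviations definition.

```python
from statistics import mean, median, stdev


def mad(values):
    """Median absolute deviation around the median."""
    med = median(values)
    return median([abs(v - med) for v in values])


# Made-up zero-inflated sample with a slightly negative mean.
sample = [0] * 8 + [-9, 6]

print(mad(sample))                 # 0.0: more than half the deviations are zero
cv = stdev(sample) / mean(sample)  # negative because the mean is negative

# Same ratio using the standard deviation and mean reported above.
reported_cv = 25.229477 / -0.17142894  # about -147.17
```

With the mean this close to zero, the CV says little about relative dispersion; the quantiles are the more informative summary here.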
| Value | Count | Frequency (%) |
| 0 | 60742 | 78.5% |
| -1 | 2841 | 3.7% |
| 1 | 1898 | 2.5% |
| 5 | 955 | 1.2% |
| -3 | 815 | 1.1% |
| 7 | 718 | 0.9% |
| 6 | 700 | 0.9% |
| 2 | 602 | 0.8% |
| -2 | 583 | 0.8% |
| 3 | 525 | 0.7% |
| Other values (307) | 7035 | 9.1% |
| Value | Count | Frequency (%) |
| -2661 | 1 | < 0.1% |
| -2070 | 1 | < 0.1% |
| -1896 | 2 | < 0.1% |
| -1303 | 1 | < 0.1% |
| -1297 | 1 | < 0.1% |
| -1284 | 1 | < 0.1% |
| -1229 | 1 | < 0.1% |
| -1224 | 1 | < 0.1% |
| -1183 | 1 | < 0.1% |
| -1105 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 869 | 1 | < 0.1% |
| 541 | 1 | < 0.1% |
| 522 | 1 | < 0.1% |
| 458 | 1 | < 0.1% |
| 400 | 1 | < 0.1% |
| 379 | 1 | < 0.1% |
| 370 | 1 | < 0.1% |
| 365 | 1 | < 0.1% |
| 331 | 1 | < 0.1% |
| 329 | 1 | < 0.1% |
inadimplente
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
| 0 | 71978 |
|---|---|
| 1 | 5436 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 77414 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 71978 | 93.0% |
| 1 | 5436 | 7.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 71978 | 93.0% |
| 1 | 5436 | 7.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 77414 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 71978 | 93.0% |
| 1 | 5436 | 7.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 77414 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 71978 | 93.0% |
| 1 | 5436 | 7.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 77414 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 71978 | 93.0% |
| 1 | 5436 | 7.0% |
prazo_em_dias
Real number (ℝ)
| Distinct | 244 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.320575 |
| Minimum | -420 |
|---|---|
| Maximum | 2677 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 27 |
| Negative (%) | < 0.1% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | -420 |
|---|---|
| 5-th percentile | 16 |
| Q1 | 16 |
| median | 18 |
| Q3 | 24 |
| 95-th percentile | 45 |
| Maximum | 2677 |
| Range | 3097 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 26.137018 |
|---|---|
| Coefficient of variation (CV) | 1.1207707 |
| Kurtosis | 3168.4273 |
| Mean | 23.320575 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 43.887208 |
| Sum | 1805339 |
| Variance | 683.1437 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 16 | 19239 | 24.9% |
| 18 | 10131 | 13.1% |
| 17 | 9160 | 11.8% |
| 22 | 4776 | 6.2% |
| 19 | 4640 | 6.0% |
| 20 | 4548 | 5.9% |
| 25 | 2545 | 3.3% |
| 21 | 2455 | 3.2% |
| 36 | 1793 | 2.3% |
| 30 | 1769 | 2.3% |
| Other values (234) | 16358 | 21.1% |
| Value | Count | Frequency (%) |
| -420 | 1 | < 0.1% |
| -320 | 1 | < 0.1% |
| -319 | 1 | < 0.1% |
| -256 | 1 | < 0.1% |
| -187 | 1 | < 0.1% |
| -110 | 2 | < 0.1% |
| -107 | 2 | < 0.1% |
| -74 | 1 | < 0.1% |
| -67 | 1 | < 0.1% |
| -62 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 2677 | 1 | < 0.1% |
| 2107 | 1 | < 0.1% |
| 1911 | 2 | < 0.1% |
| 1318 | 2 | < 0.1% |
| 1313 | 1 | < 0.1% |
| 1244 | 2 | < 0.1% |
| 1198 | 1 | < 0.1% |
| 1124 | 1 | < 0.1% |
| 1069 | 1 | < 0.1% |
| 1040 | 1 | < 0.1% |
meses_desde_cadastro
Real number (ℝ)
| Distinct | 252 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 117.88077 |
| Minimum | -1 |
|---|---|
| Maximum | 250 |
| Zeros | 88 |
| Zeros (%) | 0.1% |
| Negative | 7 |
| Negative (%) | < 0.1% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 15 |
| Q1 | 60 |
| median | 102 |
| Q3 | 179 |
| 95-th percentile | 243 |
| Maximum | 250 |
| Range | 251 |
| Interquartile range (IQR) | 119 |
Descriptive statistics
| Standard deviation | 74.58118 |
|---|---|
| Coefficient of variation (CV) | 0.63268317 |
| Kurtosis | -1.0283621 |
| Mean | 117.88077 |
| Median Absolute Deviation (MAD) | 49 |
| Skewness | 0.4547474 |
| Sum | 9125622 |
| Variance | 5562.3524 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 91 | 750 | 1.0% |
| 93 | 724 | 0.9% |
| 104 | 721 | 0.9% |
| 95 | 721 | 0.9% |
| 97 | 720 | 0.9% |
| 107 | 702 | 0.9% |
| 99 | 700 | 0.9% |
| 92 | 693 | 0.9% |
| 94 | 692 | 0.9% |
| 101 | 688 | 0.9% |
| Other values (242) | 70303 | 90.8% |
| Value | Count | Frequency (%) |
| -1 | 7 | < 0.1% |
| 0 | 88 | 0.1% |
| 1 | 163 | 0.2% |
| 2 | 175 | 0.2% |
| 3 | 196 | 0.3% |
| 4 | 227 | 0.3% |
| 5 | 253 | 0.3% |
| 6 | 199 | 0.3% |
| 7 | 237 | 0.3% |
| 8 | 274 | 0.4% |
| Value | Count | Frequency (%) |
| 250 | 532 | 0.7% |
| 249 | 550 | 0.7% |
| 248 | 538 | 0.7% |
| 247 | 553 | 0.7% |
| 246 | 420 | 0.5% |
| 245 | 478 | 0.6% |
| 244 | 534 | 0.7% |
| 243 | 526 | 0.7% |
| 242 | 527 | 0.7% |
| 241 | 508 | 0.7% |
dias_desde_cadastro
Real number (ℝ)
| Distinct | 455 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3566.1491 |
| Minimum | -1 |
|---|---|
| Maximum | 7663 |
| Zeros | 88 |
| Zeros (%) | 0.1% |
| Negative | 7 |
| Negative (%) | < 0.1% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 369 |
| Q1 | 1825 |
| median | 2927 |
| Q3 | 5474 |
| 95-th percentile | 7303 |
| Maximum | 7663 |
| Range | 7664 |
| Interquartile range (IQR) | 3649 |
Descriptive statistics
| Standard deviation | 2291.6023 |
|---|---|
| Coefficient of variation (CV) | 0.64259855 |
| Kurtosis | -1.0082434 |
| Mean | 3566.1491 |
| Median Absolute Deviation (MAD) | 1460 |
| Skewness | 0.49440663 |
| Sum | 2.7606987 × 10⁸ |
| Variance | 5251441 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2919 | 710 | 0.9% |
| 3284 | 702 | 0.9% |
| 2921 | 693 | 0.9% |
| 2920 | 660 | 0.9% |
| 2923 | 648 | 0.8% |
| 6939 | 642 | 0.8% |
| 2924 | 628 | 0.8% |
| 2562 | 627 | 0.8% |
| 3285 | 621 | 0.8% |
| 6937 | 621 | 0.8% |
| Other values (445) | 70862 | 91.5% |
| Value | Count | Frequency (%) |
| -1 | 7 | < 0.1% |
| 0 | 88 | 0.1% |
| 1 | 147 | 0.2% |
| 2 | 135 | 0.2% |
| 3 | 156 | 0.2% |
| 4 | 137 | 0.2% |
| 5 | 169 | 0.2% |
| 6 | 102 | 0.1% |
| 7 | 105 | 0.1% |
| 8 | 116 | 0.1% |
| Value | Count | Frequency (%) |
| 7663 | 532 | 0.7% |
| 7662 | 550 | 0.7% |
| 7661 | 538 | 0.7% |
| 7660 | 553 | 0.7% |
| 7659 | 420 | 0.5% |
| 7658 | 478 | 0.6% |
| 7304 | 534 | 0.7% |
| 7303 | 526 | 0.7% |
| 7302 | 527 | 0.7% |
| 7301 | 508 | 0.7% |
valor_emprestimo
Real number (ℝ)
| Distinct | 70419 |
|---|---|
| Distinct (%) | 91.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 43626.239 |
| Minimum | 0.093466679 |
|---|---|
| Maximum | 4112533.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 0.093466679 |
|---|---|
| 5-th percentile | 3021.7229 |
| Q1 | 17547.345 |
| median | 32541.981 |
| Q3 | 57088.05 |
| 95-th percentile | 120220.81 |
| Maximum | 4112533.9 |
| Range | 4112533.8 |
| Interquartile range (IQR) | 39540.705 |
Descriptive statistics
| Standard deviation | 43466.58 |
|---|---|
| Coefficient of variation (CV) | 0.99634029 |
| Kurtosis | 1142.5027 |
| Mean | 43626.239 |
| Median Absolute Deviation (MAD) | 16954.432 |
| Skewness | 16.249908 |
| Sum | 3.3772816 × 10⁹ |
| Variance | 1.8893435 × 10⁹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1115.199547 | 195 | 0.3% |
| 1104.776147 | 192 | 0.2% |
| 1125.821507 | 127 | 0.2% |
| 933.7321245 | 80 | 0.1% |
| 942.5417492 | 73 | 0.1% |
| 892.1596377 | 65 | 0.1% |
| 1084.503165 | 54 | 0.1% |
| 883.8209178 | 52 | 0.1% |
| 1055.45138 | 50 | 0.1% |
| 951.5191923 | 47 | 0.1% |
| Other values (70409) | 76479 | 98.8% |
| Value | Count | Frequency (%) |
| 0.09346667913 | 1 | < 0.1% |
| 0.3738667165 | 1 | < 0.1% |
| 0.4286122488 | 1 | < 0.1% |
| 0.6250558086 | 1 | < 0.1% |
| 4.832227311 | 1 | < 0.1% |
| 5.238594152 | 1 | < 0.1% |
| 5.453344655 | 1 | < 0.1% |
| 5.613737145 | 1 | < 0.1% |
| 5.660911407 | 1 | < 0.1% |
| 5.92437375 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 4112533.882 | 1 | < 0.1% |
| 2122841.778 | 1 | < 0.1% |
| 1601607.765 | 1 | < 0.1% |
| 1415227.852 | 1 | < 0.1% |
| 1300902.14 | 1 | < 0.1% |
| 1262024.955 | 1 | < 0.1% |
| 1152490.713 | 1 | < 0.1% |
| 1142965.997 | 1 | < 0.1% |
| 1084213.478 | 1 | < 0.1% |
| 943485.2345 | 1 | < 0.1% |
diff_renda
Real number (ℝ)
| Distinct | 72923 |
|---|---|
| Distinct (%) | 94.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 238564.64 |
| Minimum | -4017272 |
|---|---|
| Maximum | 1666948.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 6221 |
| Negative (%) | 8.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | -4017272 |
|---|---|
| 5-th percentile | -19583.156 |
| Q1 | 80617.702 |
| median | 192231.27 |
| Q3 | 345820 |
| 95-th percentile | 662991.98 |
| Maximum | 1666948.5 |
| Range | 5684220.5 |
| Interquartile range (IQR) | 265202.3 |
Descriptive statistics
| Standard deviation | 219497.44 |
|---|---|
| Coefficient of variation (CV) | 0.92007533 |
| Kurtosis | 4.5285117 |
| Mean | 238564.64 |
| Median Absolute Deviation (MAD) | 127027.79 |
| Skewness | 1.1264315 |
| Sum | 1.8468243 × 10¹⁰ |
| Variance | 4.8179126 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -1182 | 29 | < 0.1% |
| -26498.5 | 13 | < 0.1% |
| 299159 | 9 | < 0.1% |
| 335996.49 | 9 | < 0.1% |
| 58693 | 9 | < 0.1% |
| 199585.51 | 8 | < 0.1% |
| 399700 | 8 | < 0.1% |
| 227340.75 | 8 | < 0.1% |
| -945.6 | 8 | < 0.1% |
| 234044.49 | 8 | < 0.1% |
| Other values (72913) | 77305 | 99.9% |
| Value | Count | Frequency (%) |
| -4017272 | 1 | < 0.1% |
| -2190198 | 1 | < 0.1% |
| -1619335.07 | 1 | < 0.1% |
| -1444077 | 1 | < 0.1% |
| -1267730 | 1 | < 0.1% |
| -1081481 | 1 | < 0.1% |
| -991929 | 1 | < 0.1% |
| -901252 | 1 | < 0.1% |
| -850889.2 | 1 | < 0.1% |
| -787125 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1666948.46 | 1 | < 0.1% |
| 1662590.82 | 1 | < 0.1% |
| 1662150.07 | 1 | < 0.1% |
| 1645689.4 | 1 | < 0.1% |
| 1642923.5 | 1 | < 0.1% |
| 1633607 | 1 | < 0.1% |
| 1629676.63 | 1 | < 0.1% |
| 1605036.1 | 1 | < 0.1% |
| 1603360 | 1 | < 0.1% |
| 1602147.59 | 1 | < 0.1% |
len_credit_history
Real number (ℝ)
| Distinct | 238 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 205.55786 |
| Minimum | 1 |
|---|---|
| Maximum | 1151 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 22 |
| Q1 | 82 |
| median | 129 |
| Q3 | 241 |
| 95-th percentile | 638 |
| Maximum | 1151 |
| Range | 1150 |
| Interquartile range (IQR) | 159 |
Descriptive statistics
| Standard deviation | 220.92867 |
|---|---|
| Coefficient of variation (CV) | 1.0747761 |
| Kurtosis | 6.590679 |
| Mean | 205.55786 |
| Median Absolute Deviation (MAD) | 65 |
| Skewness | 2.4746214 |
| Sum | 15913056 |
| Variance | 48809.478 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1151 | 1151 | 1.5% |
| 1055 | 1055 | 1.4% |
| 157 | 942 | 1.2% |
| 877 | 877 | 1.1% |
| 130 | 780 | 1.0% |
| 375 | 750 | 1.0% |
| 125 | 750 | 1.0% |
| 124 | 744 | 1.0% |
| 122 | 732 | 0.9% |
| 102 | 714 | 0.9% |
| Other values (228) | 68919 | 89.0% |
| Value | Count | Frequency (%) |
| 1 | 108 | 0.1% |
| 2 | 128 | 0.2% |
| 3 | 180 | 0.2% |
| 4 | 220 | 0.3% |
| 5 | 195 | 0.3% |
| 6 | 204 | 0.3% |
| 7 | 203 | 0.3% |
| 8 | 192 | 0.2% |
| 9 | 171 | 0.2% |
| 10 | 150 | 0.2% |
| Value | Count | Frequency (%) |
| 1151 | 1151 | 1.5% |
| 1055 | 1055 | 1.4% |
| 877 | 877 | 1.1% |
| 675 | 675 | 0.9% |
| 638 | 638 | 0.8% |
| 585 | 585 | 0.8% |
| 557 | 557 | 0.7% |
| 557 | 557 | 0.7% |
| 539 | 539 | 0.7% |
| 510 | 510 | 0.7% |
| 505 | 505 | 0.7% |
ultima_data_emprestimo
Real number (ℝ)
| Distinct | 382 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.556837 |
| Minimum | 0 |
|---|---|
| Maximum | 901 |
| Zeros | 9650 |
| Zeros (%) | 12.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 4 |
| Q3 | 11 |
| 95-th percentile | 35 |
| Maximum | 901 |
| Range | 901 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 26.462226 |
|---|---|
| Coefficient of variation (CV) | 2.5066434 |
| Kurtosis | 234.62211 |
| Mean | 10.556837 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 12.226621 |
| Sum | 817247 |
| Variance | 700.24941 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 9650 | 12.5% |
| 1 | 9532 | 12.3% |
| 2 | 7821 | 10.1% |
| 3 | 6529 | 8.4% |
| 4 | 5475 | 7.1% |
| 5 | 4246 | 5.5% |
| 7 | 3998 | 5.2% |
| 6 | 3630 | 4.7% |
| 8 | 2533 | 3.3% |
| 9 | 2091 | 2.7% |
| Other values (372) | 21909 | 28.3% |
| Value | Count | Frequency (%) |
| 0 | 9650 | 12.5% |
| 1 | 9532 | 12.3% |
| 2 | 7821 | 10.1% |
| 3 | 6529 | 8.4% |
| 4 | 5475 | 7.1% |
| 5 | 4246 | 5.5% |
| 6 | 3630 | 4.7% |
| 7 | 3998 | 5.2% |
| 8 | 2533 | 3.3% |
| 9 | 2091 | 2.7% |
| Value | Count | Frequency (%) |
| 901 | 1 | < 0.1% |
| 846 | 1 | < 0.1% |
| 837 | 1 | < 0.1% |
| 833 | 1 | < 0.1% |
| 808 | 1 | < 0.1% |
| 727 | 1 | < 0.1% |
| 719 | 2 | < 0.1% |
| 714 | 2 | < 0.1% |
| 683 | 1 | < 0.1% |
| 672 | 1 | < 0.1% |
Auto
The auto setting is an interpretable pairwise column metric that uses the following mapping (Variable_type-Variable_type : Method, Range):
- Categorical-Categorical : Cramér's V, [0,1]
- Numerical-Categorical : Cramér's V, [0,1] (using a discretized numerical column)
- Numerical-Numerical : Spearman's ρ, [-1,1]
This configuration uses the recommended metric for each pair of columns.
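As a rough illustration of the mapping above, the selection can be written as a small lookup. This is only a sketch: the function name `auto_metric` and the type labels `"categorical"` and `"numerical"` are hypothetical and are not taken from the profiling library.

```python
def auto_metric(type_a, type_b):
    """Return (method, (low, high)) for a pair of column types.

    Hypothetical helper: mirrors the mapping listed above, not the
    profiling library's internal code.
    """
    allowed = {"categorical", "numerical"}
    if type_a not in allowed or type_b not in allowed:
        raise ValueError("column types must be 'categorical' or 'numerical'")
    if type_a == "numerical" and type_b == "numerical":
        return "Spearman's rho", (-1.0, 1.0)
    # Any pair involving a categorical column uses Cramér's V; a numerical
    # partner is assumed to be discretized before the measure is computed.
    return "Cramér's V", (0.0, 1.0)


print(auto_metric("numerical", "numerical"))
print(auto_metric("numerical", "categorical"))
```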
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
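The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The helper names (`average_ranks`, `pearson`, `spearman_rho`) and the toy data are illustrative assumptions, and ties are given average ranks, which is a common convention rather than something the report states.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)  # the 1/n factors cancel


def spearman_rho(x, y):
    """Spearman's rho is Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear
print(spearman_rho(x, y), pearson(x, y))
```

On this toy data the rank-based coefficient is essentially 1 while Pearson's r on the raw values is lower, which is the difference the paragraph describes.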
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
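A minimal sketch of this formula, also checking the invariance property on made-up numbers. The function name `pearson_r` and the sample values are assumptions for illustration, and the rescaling uses positive scale factors, since a negative factor would flip the sign of r.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # roughly y = 2x, with small deviations

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_rescaled)
```

The two printed values agree up to floating-point rounding.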
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
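The pair-counting definition above can be sketched directly. This is the simple form of the coefficient (often called τ-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries commonly apply a tie correction, so their results may differ on data with ties. The function name and the sample data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair: 5 concordant, 1 discordant, 6 pairs
print(kendall_tau_a(x, y))
```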
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for it.
| | ID_CLIENTE | SAFRA_REF | DATA_EMISSAO_DOCUMENTO | DATA_PAGAMENTO | DATA_VENCIMENTO | VALOR_A_PAGAR | TAXA | DATA_CADASTRO | DDD | FLAG_PF | SEGMENTO_INDUSTRIAL | DOMINIO_EMAIL | PORTE | CEP_2_DIG | RENDA_MES_ANTERIOR | NO_FUNCIONARIOS | tempo_sem_pagar | inadimplente | prazo_em_dias | meses_desde_cadastro | dias_desde_cadastro | valor_emprestimo | diff_renda | len_credit_history | ultima_data_emprestimo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1661240395903230676 | 2018-08 | 2018-08-17 | 2018-09-06 | 2018-09-06 | 35516.41 | 6.99 | 2013-08-22 | 99.0 | 0 | Serviços | YAHOO | PEQUENO | 65.0 | 0.0 | 0.0 | 0 | 0 | 20 | 60 | 1825 | 33196.008973 | -35516.41 | 116.0 | 0.0 |
| 1 | 1661240395903230676 | 2018-08 | 2018-08-19 | 2018-09-11 | 2018-09-10 | 17758.21 | 6.99 | 2013-08-22 | 99.0 | 0 | Serviços | YAHOO | PEQUENO | 65.0 | 0.0 | 0.0 | 1 | 0 | 22 | 60 | 1825 | 16598.009160 | -17758.21 | 116.0 | 2.0 |
| 2 | 1661240395903230676 | 2018-08 | 2018-08-26 | 2018-09-18 | 2018-09-17 | 17431.96 | 6.99 | 2013-08-22 | 99.0 | 0 | Serviços | YAHOO | PEQUENO | 65.0 | 0.0 | 0.0 | 1 | 0 | 22 | 60 | 1825 | 16293.074119 | -17431.96 | 116.0 | 7.0 |
| 3 | 1661240395903230676 | 2018-08 | 2018-08-30 | 2018-10-11 | 2018-10-05 | 1341.00 | 6.99 | 2013-08-22 | 99.0 | 0 | Serviços | YAHOO | PEQUENO | 65.0 | 0.0 | 0.0 | 6 | 1 | 36 | 60 | 1825 | 1253.388167 | -1341.00 | 116.0 | 4.0 |
| 4 | 1661240395903230676 | 2018-08 | 2018-08-31 | 2018-09-20 | 2018-09-20 | 21309.85 | 6.99 | 2013-08-22 | 99.0 | 0 | Serviços | YAHOO | PEQUENO | 65.0 | 0.0 | 0.0 | 0 | 0 | 20 | 60 | 1825 | 19917.609122 | -21309.85 | 116.0 | 1.0 |
| 5 | 8274986328479596038 | 2018-08 | 2018-08-17 | 2018-09-25 | 2018-09-25 | 48811.35 | 6.99 | 2017-01-25 | 31.0 | 0 | Comércio | YAHOO | MEDIO | 77.0 | 0.0 | 0.0 | 0 | 0 | 39 | 19 | 372 | 45622.347883 | -48811.35 | 43.0 | 0.0 |
| 6 | 345447888460137901 | 2018-08 | 2018-08-17 | 2018-09-05 | 2018-09-05 | 55131.20 | 5.99 | 2000-08-15 | 75.0 | 0 | Serviços | HOTMAIL | PEQUENO | 48.0 | 0.0 | 0.0 | 0 | 0 | 19 | 216 | 6570 | 52015.473158 | -55131.20 | 28.0 | 0.0 |
| 7 | 1003144834589372198 | 2018-08 | 2018-08-17 | 2018-09-03 | 2018-09-03 | 85855.04 | 6.99 | 2017-08-06 | 49.0 | 0 | Serviços | OUTLOOK | PEQUENO | 89.0 | 0.0 | 0.0 | 0 | 0 | 17 | 12 | 365 | 80245.854753 | -85855.04 | 148.0 | 0.0 |
| 8 | 324916756972236008 | 2018-08 | 2018-08-17 | 2018-09-03 | 2018-09-03 | 42072.00 | 5.99 | 2011-02-14 | 88.0 | 0 | Serviços | GMAIL | GRANDE | 62.0 | 0.0 | 0.0 | 0 | 0 | 17 | 90 | 2561 | 39694.310784 | -42072.00 | 252.0 | 0.0 |
| 9 | 324916756972236008 | 2018-08 | 2018-08-19 | 2018-09-05 | 2018-09-05 | 21071.97 | 5.99 | 2011-02-14 | 88.0 | 0 | Serviços | GMAIL | GRANDE | 62.0 | 0.0 | 0.0 | 0 | 0 | 17 | 90 | 2561 | 19881.092556 | -21071.97 | 252.0 | 2.0 |
| | ID_CLIENTE | SAFRA_REF | DATA_EMISSAO_DOCUMENTO | DATA_PAGAMENTO | DATA_VENCIMENTO | VALOR_A_PAGAR | TAXA | DATA_CADASTRO | DDD | FLAG_PF | SEGMENTO_INDUSTRIAL | DOMINIO_EMAIL | PORTE | CEP_2_DIG | RENDA_MES_ANTERIOR | NO_FUNCIONARIOS | tempo_sem_pagar | inadimplente | prazo_em_dias | meses_desde_cadastro | dias_desde_cadastro | valor_emprestimo | diff_renda | len_credit_history | ultima_data_emprestimo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 77404 | 511305041047444825 | 2021-06 | 2021-06-30 | 2021-07-16 | 2021-07-16 | 49318.20 | 11.99 | 2011-02-14 | 79.0 | 0 | Serviços | YAHOO | GRANDE | 49.0 | 183354.0 | 87.0 | 0 | 0 | 16 | 124 | 3654 | 44038.039111 | 134035.80 | 82.0 | 27.0 |
| 77405 | 8252215766413781202 | 2021-06 | 2021-06-30 | 2021-08-16 | 2021-08-16 | 65599.60 | 5.99 | 2011-02-14 | 53.0 | 0 | Comércio | YAHOO | MEDIO | 96.0 | 1178103.0 | 109.0 | 0 | 0 | 47 | 124 | 3654 | 61892.253986 | 1112503.40 | 40.0 | 41.0 |
| 77406 | 8480109508191086169 | 2021-06 | 2021-06-30 | 2021-07-16 | 2021-07-16 | 95487.05 | 5.99 | 2011-02-14 | 11.0 | 0 | Indústria | GMAIL | PEQUENO | 40.0 | 241007.0 | 126.0 | 0 | 0 | 16 | 124 | 3654 | 90090.621757 | 145519.95 | 77.0 | 15.0 |
| 77407 | 7686361238195690925 | 2021-06 | 2021-06-30 | 2021-07-16 | 2021-07-16 | 25979.95 | 6.99 | 2014-02-02 | 49.0 | 0 | Serviços | AOL | PEQUENO | 89.0 | 445981.0 | 145.0 | 0 | 0 | 16 | 88 | 2559 | 24282.596504 | 420001.05 | 146.0 | 17.0 |
| 77408 | 4530631557358349711 | 2021-06 | 2021-06-30 | 2021-07-16 | 2021-07-16 | 63971.51 | 5.99 | 2000-08-15 | 11.0 | 0 | Serviços | HOTMAIL | PEQUENO | 55.0 | 139142.0 | 116.0 | 0 | 0 | 16 | 250 | 7663 | 60356.175111 | 75170.49 | 125.0 | 16.0 |
| 77409 | 2951563549197799278 | 2021-06 | 2021-06-30 | 2021-07-16 | 2021-07-16 | 89980.00 | 5.99 | 2000-08-15 | 11.0 | 0 | Comércio | AOL | PEQUENO | 13.0 | 280343.0 | 161.0 | 0 | 0 | 16 | 250 | 7663 | 84894.801396 | 190363.00 | 127.0 | 31.0 |
| 77410 | 5220206408301580591 | 2021-06 | 2021-06-30 | 2021-08-16 | 2021-08-16 | 42239.00 | 5.99 | 2021-04-08 | 19.0 | 0 | Indústria | GMAIL | GRANDE | 25.0 | 235315.0 | 87.0 | 0 | 0 | 47 | 2 | 2 | 39851.872818 | 193076.00 | 2.0 | 21.0 |
| 77411 | 5860276371789140450 | 2021-06 | 2021-06-30 | 2021-07-16 | 2021-07-16 | 20921.50 | 5.99 | 2011-02-15 | 91.0 | 0 | Serviços | HOTMAIL | GRANDE | 67.0 | 100006.0 | 126.0 | 0 | 0 | 16 | 124 | 3654 | 19739.126333 | 79084.50 | 161.0 | 17.0 |
| 77412 | 2814790209436551216 | 2021-06 | 2021-06-30 | 2021-07-16 | 2021-07-16 | 90231.05 | 6.99 | 2021-05-13 | 1.0 | 0 | Serviços | YAHOO | MEDIO | 14.0 | 0.0 | 0.0 | 0 | 0 | 16 | 1 | 1 | 84335.965978 | -90231.05 | 1.0 | 0.0 |
| 77413 | 8343941262792249232 | 2021-06 | 2021-06-30 | 2021-08-16 | 2021-08-16 | 20736.51 | 4.99 | 2019-05-28 | 11.0 | 0 | Indústria | HOTMAIL | GRANDE | 31.0 | 97599.0 | 116.0 | 0 | 0 | 47 | 25 | 731 | 19750.938185 | 76862.49 | 100.0 | 15.0 |